This paper applies the PAT-tree structure to search engine results clustering in Chinese information retrieval and proposes a new clustering algorithm for Chinese search engine results based on a modified PAT-tree.
4) The 4D-Var assimilation and forecast experiments show that the forecast skill for rainfall and other model variables improves to some extent when AMSU retrieval fields are assimilated into the initial conditions by the MM5 4D-Var system.
Therefore, this article turns to statistical natural language processing to find the best way to implement the above conclusion. Finally, it proposes a new information retrieval model based on a section language model.
The IDF (inverse document frequency) formula, used in information retrieval to calculate the relevance weight between keywords and relevant documents, is applied to the automatic categorization process. Combined with the results of an analysis of Chinese web pages, this yields a weight formula with an adjustable parameter. To meet the requirements of the formula, a categorization weight vector library is designed and built to store the categorization training results.
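As a rough illustration of the IDF weighting mentioned above, the following sketch computes a smoothed IDF weight per term. The excerpt does not give the paper's formula, so the function name `idf_weights` and the `smoothing` parameter are assumptions standing in for the "adjustable parameter"; this is a common textbook variant, not the authors' formula.

```python
import math
from collections import Counter


def idf_weights(docs, smoothing=1.0):
    """Return a smoothed IDF weight for every term in `docs`.

    `docs` is a list of token lists. `smoothing` is a hypothetical
    adjustable parameter; the source does not specify its own.
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        # Count each term once per document (document frequency).
        doc_freq.update(set(tokens))
    return {
        term: math.log((n_docs + smoothing) / (df + smoothing)) + 1.0
        for term, df in doc_freq.items()
    }


if __name__ == "__main__":
    sample = [["news", "sports"], ["news", "finance"], ["news"]]
    for term, weight in sorted(idf_weights(sample).items()):
        print(term, round(weight, 3))
```

Terms that occur in every document get the lowest weight, while terms confined to a few documents get higher weights, which is what makes them useful for separating categories.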
To tackle this difficult problem, topic-focused crawling technology emerged; its high degree of specialization and clear objectives have earned it a place in search engine research and in the development of next-generation search engines. In addition to building specialized search engines, it can be used in client-based real-time information retrieval systems, and it has become a new research hotspot in network information retrieval. First, taking noise into account, we extend the extraction range of anchor text to define "link context information"; next, we write an automatic extraction algorithm; then, before the crawler runs, we collect a large amount of link context information for the seed page addresses and use it to guide the crawler.
Content-based image retrieval (CBIR) is retrieval that directly uses the content of an image to query image information, and it is one of the most active research topics in multimedia retrieval. The main idea is to analyze the information in an image using low-level features such as color, texture, shape, and the spatial relationships of objects, and to build a feature vector for the image that serves as its retrieval index. At present, the main CBIR method is similarity search based on the multi-dimensional feature vectors of images.
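The feature-vector idea can be sketched as follows, using only a coarse color histogram as the feature and cosine similarity as the measure. Both choices, and the names `color_histogram`, `cosine_similarity`, and `rank_by_similarity`, are assumptions made for illustration; the excerpt does not specify which features or similarity measure a given CBIR system uses.

```python
import math


def color_histogram(pixels, bins=4):
    """Build a normalized RGB histogram with bins**3 cells.

    `pixels` is an iterable of (r, g, b) tuples with values in 0..255.
    """
    hist = [0.0] * (bins ** 3)
    count = 0
    for r, g, b in pixels:
        # Map each channel value 0..255 to a bin index 0..bins-1.
        ri, gi, bi = r * bins // 256, g * bins // 256, b * bins // 256
        hist[ri * bins * bins + gi * bins + bi] += 1.0
        count += 1
    return [h / count for h in hist] if count else hist


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_by_similarity(query_vec, index):
    """Return image names from `index` ordered from most to least similar."""
    return sorted(
        index,
        key=lambda name: cosine_similarity(query_vec, index[name]),
        reverse=True,
    )


if __name__ == "__main__":
    index = {
        "dark_red": color_histogram([(250, 10, 5)] * 10),
        "blue": color_histogram([(0, 0, 255)] * 10),
    }
    query = color_histogram([(255, 0, 0)] * 10)
    print(rank_by_similarity(query, index))
```

A real system would typically concatenate texture and shape descriptors into the same vector and use an index structure rather than a linear scan.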